Aspectual classification maps verbs to a small set of primitive categories in order to
reason about time. This classification is necessary for interpreting temporal modifiers and 
assessing temporal relationships, and is therefore a required component for many natural 

The problem of searching the elements of a set that are close to a given query element
under some similarity criterion has a vast number of applications in many branches of 
computer science, from pattern recognition to textual and multimedia information 
retrieval. We are interested in the rather general case where the similarity criterion 

The relationship between doctors and their patients is becoming increasingly important
in health care provision. It determines compliance with treatment and part of
the curative process. In psychiatry the therapeutic relationship has even more power.
Therefore having a general rule that could guide doctors towards a good relationship with
their patients would be very useful. This paper describes experiments in automated 

XML documents have recently become ubiquitous because of their varied applicability in a
number of applications. Classification is an important problem in the data mining domain, 
but current classification methods for XML documents use IR-based methods in which 
each document is treated as a bag of words. Such techniques ignore a significant amount 
of information hidden inside the documents. In this paper we discuss the problem of rule 
based classification of XML data by using frequent discrimina ... 

Machine learning is the study of computational methods for improving performance by
mechanizing the acquisition of knowledge from experience. Expert performance requires 
much domain-specific knowledge, and knowledge engineering has produced hundreds of 
AI expert systems that are now used regularly in industry. Machine learning aims to 
provide increasing levels of automation in the knowledge engineering process, replacing 
much time-consuming human activity with automatic tec ... 

Recently, mining data streams with concept drifts for actionable insights has become an
important and challenging task for a wide range of applications including credit card fraud 
protection, target marketing, network intrusion detection, etc. Conventional knowledge 
discovery tools are facing two challenges, the overwhelming volume of the streaming 
data, and the concept drifts. In this paper, we propose a general framework for mining 
concept-drifting data streams using weighted ensemble classifi ... 

Probabilistic, or randomized, algorithms are fast becoming as commonplace as
conventional deterministic algorithms. This survey presents five techniques that have 

A precise analysis of partial match retrieval of multidimensional data is presented. The
structures considered here are multidimensional search trees (k-d-trees) and digital tries 
(k-d-tries), as well as structures designed for efficient retrieval of information stored on 
external devices. The methods used include a detailed study of a differential system 

A statistical profile summarizes the instances of a database. It describes aspects such as
the number of tuples, the number of values, the distribution of values, the correlation 
between value sets, and the distribution of tuples among secondary storage units. 
Estimation of database profiles is critical in the problems of query optimization, physical 
database design, and database performance prediction. This paper describes a model of a 
database profile, relates this model to estimating ...

Variable and feature selection have become the focus of much research in areas of
application for which datasets with tens or hundreds of thousands of variables are 
available. These areas include text processing of internet documents, gene expression 
array analysis, and combinatorial chemistry. The objective of variable selection is
three-fold: improving the prediction performance of the predictors, providing faster and more

Understanding distributed applications is a tedious and difficult task. Visualizations based
on process-time diagrams are often used to obtain a better understanding of the 
execution of the application. The visualization tool we use is Poet, an event tracer 
developed at the University of Waterloo. However, these diagrams are often very complex 
and do not provide the user with the desired overview of the application. In our 
experience, such tools display repeated occurrences of non-trivial commun ... 

Estimating the selectivity of multidimensional range queries over real valued attributes
has significant applications in data exploration and database query optimization. In this 
paper, we consider the following problem: given a table of d attributes whose domain is 
the real numbers and a query that specifies a range in each dimension, find a good 
approximation of the number of records in the table that satisfy the query. The simplest 
approach to tackle this problem is to assume that the ... 

Recent work has demonstrated the effectiveness of the wavelet decomposition in reducing
large amounts of data to compact sets of wavelet coefficients (termed "wavelet 
synopses") that can be used to provide fast and reasonably accurate approximate query 
answers. A major shortcoming of these existing wavelet techniques is that the quality of 
the approximate answers they provide varies widely, even for identical queries on nearly 
identical values in distinct parts of the data. As a result, users ha ... 

Keywords: Wavelets, approximate query processing, data synopses, randomized 
rounding 



19 The quadtree and related hierarchical data structures
Hanan Samet 

June 1984 ACM Computing Surveys (CSUR), volume 16 issue 2 
Publisher: ACM Press 




20 Research papers: OLAP: SHIFT-SPLIT: I/O efficient maintenance of wavelet-transformed multidimensional data

Mehrdad Jahangiri, Dimitris Sacharidis, Cyrus Shahabi

June 2005 Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Publisher: ACM Press 


The Discrete Wavelet Transform is a proven tool for a wide range of database 
applications. However, despite broad acceptance, some of its properties have not been 
fully explored and thus not exploited, particularly for two common forms of 
multidimensional decomposition. We introduce two novel operations for wavelet 
transformed data, termed SHIFT and SPLIT, based on the properties of wavelet trees, 
which work directly in the wavelet domain. We demonstrate their significance and 
usefulness by anal ... 